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Abstract 

The objective of this paper is to develop methods for solving image recovery problems subject to 
constraints on the solution. More precisely, we will be interested in problems which can be formulated as 
the minimization over a closed convex constraint set of the sum of two convex functions / and g, where 
/ may be non-smooth and g is differentiate with a Lipschitz-continuous gradient. To reach this goal, 
we derive two types of algorithms that combine forward-backward and Douglas-Rachford iterations. 
The weak convergence of the proposed algorithms is proved. In the case when the Lipschitz-continuity 
property of the gradient of g is not satisfied, we also show that, under some assumptions, it remains 
possible to apply these methods to the considered optimization problem by making use of a quadratic 
extension technique. The effectiveness of the algorithms is demonstrated for two wavelet-based image 
restoration problems involving a signal-dependent Gaussian noise and a Poisson noise, respectively. 

1 Introduction 

Wavelet decompositions |34j proved their efficiency in solving many inverse problems. More recently, frame 
representations such as Bandlets [32] , Curvelets [IT], Grouplets [33] or dual-trees [H1[T5] have gained much 
popularity. These linear tools provide geometrical representations of images and they are able to easily 
incorporate a priori information (e.g. via some simple statistical models) on the data. Variational or 
Bayesian formulations of inverse problems using such representations often lead to the minimization of 
convex objective functions including a non-differentiable term having a sparsity promoting role [13[ 1381 [3j 

In restoration problems, the observed data are corrupted by a linear operator and a noise which is not 
necessarily additive. To solve this problem, one can adopt a variational approach, aiming at minimizing 
the sum of two functions / and g over a convex set C in the transform domain. Throughout the paper, / 
and g are assumed to be in the class Tq(7{) of lower semicontinuous convex functions taking their values 
in ] — oo, Too] which are proper (i.e. not identically equal to +oo) and defined on a real separable Hilbert 
space H. Then, our objective is to solve the following: 
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Problem 1.1 Let C be a nonempty closed convex subset of TL. Let f and g be in Tq(TL), where g is 
differ entiable on TL with a (3-Lipschitz continuous gradient for some (3 G ]0, +00 [. 



Find 



mmf(x)+g(x). 

16C 



Problcm ll.ll is equivalent to minimizing / + g + lq, where lq denotes the indicator function of C, i.e. 



Up to now, many authors devoted their works to the unconstrained case, i.e. C — TL. So-called thresholded 

Landweber algorithms belonging to the more general class of forward-backward optimization methods were 

proposed in j28l El [22l [9] in order to solve the problem numerically. Daubechies et al. [22] investigated the 

convergence of these algorithms in the particular case when g is a quadratic function and / is a weighted 

^p-norm with p G [1,2]. These approaches were put into a more general convex analysis framework in 

[2"0] and extended to frame representations in [2]. Attention was also paid to the improvement of the 

convergence speed of the forward-backward algorithm in [7], for some specific choices of / and g. In [45] . 

an accelerated method was suggested in the specific case of a deconvolution in a Shannon wavelet basis. 

Then, a Douglas-Rachford algorithm relaxing the assumption of differentiability of g was introduced in 

|18j . In recent works [23l [24] , a variational approach, which is grounded on a judicious use of the Anscombc 

transform, was developed for the deconvolution of data contaminated by Poisson noise. A modification of the 

forward-backward algorithm was subsequently proposed in finite dimension in order to solve the associated 

optimization problem. Additional comments concerning this approach will be given in Sections 13.2.21 and 

15.41 A key tool in the study of the aforementioned methods is the proximity operator introduced by Moreau 
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in 1962 [36] [37]. The proximity operator of / G Tq(TL) is prox^ : TL — > TL: x 1— > argmin ye -H - \\y — x\\ +f(y). 

We thus see that prox tc reduces to the projection Pc onto the convex set C . The function / in Problem ll.il 
may be non-smooth and, actually, it is often chosen as an ^ 1 -norm, in which case its proximity operator 
reduces to a componentwise soft-thresholding [20]. In [19], the authors derived the concept of proximal 
thresholding by considering a larger set of non-differentiable convex functions. 

The goal of this paper is to propose iterative algorithms allowing us to solve Problcm ll.il when C ^TL. 
The relevance of the proposed methods is shown for image recovery problems where convex constraints on 
the solution need to be satisfied. 

In Section [2] we start by recalling some properties of the proximity operator. Then, in Section [3] we 
briefly describe the forward-backward and Douglas-Rachford methods. As the proximity operator of the 
sum of the indicator function of a convex set and a function in Tq(TL) cannot be easily expressed in general, 
we propose two iterative methods to compute this operator: the first one is a forward-backward algorithm, 
whereas the second one is a Douglas-Rachford algorithm. We also investigate the specific convergence 
properties of these two algorithms. In Section [4] we derive two iterative methods to solve Problem 11.11 
and their convergence behaviours arc studied. Finally, in Section (5[ these algorithms are applied to a 
class of image recovery problems. In this case, the Lipschitz-continuity property of the gradient of g is 
not satisfied in the considered maximum a posteriori criterion. To overcome this difficulty, a quadratic 
extension technique providing a lower approximation of the objective function is introduced. Numerical 
results concerning deconvolution problems in the presence of signal-dependent Gaussian noise or Poisson 
noise are then provided. 




0, if x G C; 
+00, otherwise. 
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2 Some properties of proximity operators 



As already mentioned, the proximity operator of lc + / plays a key role in our approach. Some useful results 
for the calculation of prox tc+ ^ are first recalled. Subsequently, the domain of a function / : Tt — >] — oo, +00] 
is denoted by dom/ = {x G Tt \ f(x) < +00}. 

Proposition 2.1 [TH Proposition 12] Let f G T n (Tt) and let C be a closed convex subset of H such that 
C n dom / 7^ . Then the following properties hold. 

(i) (Vx G TL), proxy-x G C => prox, c+ ja; = prox^x 

(ii) Suppose that TL = K. Then 

prox tc+/ = P c o proxj. (1) 

Note that, the second part of this proposition can be generalized, yielding the following result which 
appears also as an extension of [HJ Proposition 2.10] when C ^ TL: 

Proposition 2.2 Let K. be a nonempty subset ofN, (ok)keK be an orthonormal basis ofTL and (<Pk)keK be 
functions in Tq(M.). Set 

f: TL -> ]-oo,+oo] : x i-> <Pk((x, Qfc))- (2) 

keK 

Let 

C= f]{xe TL I (x,o k ) G C k } (3) 

where (Ck)keK are closed intervals in IR such that (Vfc G K) Cfc fl dom^^ 7^ 0. 
Suppose that either K is finite, or there exists a subset L o/K such that: 

(i) K \ L is finite; 

(ii) (Vfc G L) ipk > V? fc (0) = and G C fc . 



whe 



(Vac G H) prox, c+/ a; = 7^0* (4) 



el 



inf C fc i/ prox^ k (x,Ok) < mf C fc 

7r fc = <^ sup Cfc ifprox (pk (x,o k ) > supC fc (5) 

prox yfc (x,Ok) otherwise. 



Proof. Due to the form of / and C, one can write, 

(Vz G Tt) (/ + tc)(ar) = + ^J«z= °*»- 

fcSK 

For every k G K, ^ + ic fr G r (R) since ifk G r (IR) and Cfc is assumed to be a closed convex set having 
a nonempty intersection with domyjfc. If K is not finite, in view of Assumption (ii) we have (Vfc G L) 
fk + 'c fc > {<Pk + *c fc )(0) = 0. From [T4l Remark 3.2(h) and Proposition 2.10], it can be deduced that 

(Vx G ft) prox /+tc x = (prox^fc+.Cfc ( x ' fe))°fc- ( 6 ) 

a- ex 
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On the other hand, since for every k G K, Ck is a closed interval in R such that Cfc n dom tp k ^ 0, it follows 



from Proposition 1271 fii) that 

prox Vfc+lCfc (x,o fe ) =(P Cj , oprox VJ J((x,o fe )) 

ii C k , if prox Vfc (x, o k ) < inf C fe 

P rox Vfc fa) °fc> > if P rox Vfc °0 e Cfc ( 7 ) 
ip C fc , if prox tpfc (x, o k ) > sup C fc . 

Combining © and Q yields (g]) and ©. □ 

A function / (resp. convex C) satisfying j2]) (rcsp. j3])) will be said separable. Note that (j4|) and ([5|) 
imply that ([T]) holds. However, this relation has been proved under the restrictive assumption that both / 
and C are separable. In general, when cither / or C is not separable, (|TJ) is no longer valid. Let us give 
two simple counterexamples to illustrate this fact. 

Example 2.3 Let H = R 2 and f be the function defined by (Vx G K 2 ) /(x) = ^x T Ax with A = 
1,2 ' where A 2 , 2 > and \A 1>2 \ < A l 2 {%. Let C = [-1,1] 2 . This convex set is separable w.r.t. 



Al,2 A2,2^ 

the canonical basis of R 2 . 

Now, set x = 2(Ai 2j 1 + A 2 , 2 ) T - Aft er some calculations (see Avvendix Vfi\) , one obtains: 



• P c (wox f x) = (0,1) T 

• prox tc+ yX = (-7T, 1) T where 



A -^ ifA 1>2 € [-2,2] 

«/Ai, 2 >2 (8) 
1 i/Ai 2 <-2. 



We conclude that ([T]) is no£ satisfied as soon as Ai^ ^ ; </iat is / is noi separable. 

Example 2.4 Let W = R 2 . Consider the separable function defined by (Vx = (xW,x( 2 )) T G R 2 ) /(x) = 
(1 + Ai, 2 )(xW) 2 + (1 - Ai j2 )(x( 2 )) 2 where < |Ai, 2 | < 1. Let the nonseparable convex set C be defined by 

C = {x= (x (1) ,x (2) ) T el 2 | max(|x (1) -x^|,|x (1) +x (2) |) < V2}. 
In this case, it is shown in AvvendixW\ that ([!]) does not hold. 



In summary, for an arbitrary function in Tq(H) and an arbitrary closed convex set, we cannot trust ([T]) 
to determine the proximity operator of the sum of this function and the indicator function of the convex 
set. In the next section, we will propose efficient approaches to compute the desired proximity operator in 
a general setting. 

Other more classical properties of the proximity operator which will be used in the paper are provided 
in the sequel. 

Proposition 2.5 

(i) If f = h + k(-,x) where h G To(Ti), x G TL and k G R, then prox^- = prox ?l (- — kx). 

(ii) If f = h + • j| 2 /2 where h G T (H) and <& G ]0, +oo[, then 
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(a) pro X/ = prox (1+tf) _i ft ( • /(l + #)) 

(b) (V(y, z) £W 2 ) (proxjy - prox/2, y - 2) > (1 + tf)||prox/y- prox^y 5 

(c) proxy is strictly contractiv^ with constant (1 



Proof. Properties (i) and (ii)(a) result from straightforward calculations [20j Lemma 2.6]. (ii) (b) follows 
from the fact that pvox^ 1+ ^-i h is firmly nonexpansive |20( Lemma 2.4], i.e. 

(V(y, z) £ Ti 2 ) (prox h y — prox h z,y — z) > ||prox h y — prox h zll 2 . 



Thus, by using (ii)(a) we have 

(V(y, z) G H 2 ) (proxy?/ - proxy z, y - z) 
= (l + ^)^ P rox T ^( T A 



prox h 



>(l+0) 



prox h 



y 



i+?Vl + i9/' l + ?9 l + ?9 
2 



— VI + 1? 
1 + z9)||proxy?/ — proxjz|| 2 



prox 



Property (ii) (c) can then be deduced, by invoking the Cauchy-Schwarz inequality: 

(V(y, z) E U 2 ) (1 + d)\\prox f y - prox^] 2 < (prox^y - prox f z,y- z) 

< llproxyy-proxyzllHy-zl 

□ 



Recall that a function / G ro(W) satisfying the assumptions in (ii) is said to be strongly convex with 
modulus •&. 

Proposition 2.6 [T51 Proposition 11] Let Q be a real Hilbert space, let f G rrj((/), and let L: H ^ G be 

a bounded linear operator. Suppose that the composition of L and L* satisfies L o L* = vld. for some 
v G ]0, +00 [. Then foLeT (H) and 

prox foL = Id + v~ l L* o (prox„y — Id) o L. (9) 



3 Iterative solutions to the minimization of a sum of two convex 
functions 

3.1 Forward-backward approach 

Consider the following optimization problem, which is a specialization of Problem 11.11 

Problem 3.1 Let fi and fi be two functions in Tq(TI) such that Argmin/i + /2 7^ and fi is differentiable 
on H with a [3-Lipschitz continuous gradient for some (3 G ]0, +00 [. 

Find min/i(a;) + fa(x). 



As mentioned in the introduction, the forward-backward algorithm is an effective method to solve the 
above problem. 



'An operator is strictly contractive with constant (3 if it is /3-Lipschitz continuous and j3 S]0, 1[. 
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3.1.1 Algorithm [201, Eq.(3.6)] 



Let Xq € Ti be an initial value. The algorithm constructs a sequence (x n ) n ^jq by setting, for every n G N, 

x n+ i = x n + A„ (prox 7n/i (x n - 7„V/ 2 (x„) + b n ) + a n - x n ) (10) 

where j n > is the algorithm step-size, X n > is a relaxation parameter and a n € Ti. (resp. b n G Ti) 
represents an error allowed in the computation of the proximity operator (resp. the gradient). The weak 
convergence of (x n ) n eN to a solution to Problem 13. II is then guaranteed provided that: 

Assumption 3.1 

(i) < 7 < 7 < 2/3 -1 where 7 = inf„ eN j n and 7 = sup ngN j n . 

(ii) (Vn G N) < A < A„ < 1. 

(iii) E„gn || an || < +00 and £„ eN \\b n \\<+oo. 

More details concerning this algorithm can be found in [20l [14] and conditions for the strong convergence of 
the algorithm arc also given in |19| . An additional result which will be useful in this paper is the following: 



Lemma 3.2 Suppose that Assumvtions\3. Wij\ and (H) as well as the assumptions of Problem VS. 1\ hold. If f 



is a strongly convex function with modulus i9 ; then the forward-backward algorithm in (JTUJ) with a n = b n = 
converges linearly to the unique solution x to Problem \3.1\ More precisely, we have 

(VneN) || Sn _x||<(l_-^-) n || a:o _2||. (11) 

Proof. Since Argmin f\ + fi 7^ and f\ is strongly (thus strictly) convex, there exists a unique minimizcr 
x of fx + f%. Then, x is db fixed point of the forwRrd-bcickward cilgoritlim in (jlOp when o, n = b n = 0. Thus, 
we have, for all neN, 

x n+1 - x = (1 - Xn)(x n -x) + A„(prox 7n/i (x„ - 7„V/ 2 (o;„)) - prox 7n/l (x - 7„V/ 2 (x))) 

which yields 

\\x n+1 - x\\ < (1 - X n )\\x n - x\\ 

+ A„||prox 7ji/i (x n - 7„V/ 2 (a; n )) - prox 7n/i (x - 7nV/ 2 (x))||. 

Since fx has been assumed strongly convex with modulus 7n/i is strongly convex with modulus 7„$ and, 
according to Assumption I3.i|[i)[ it is also strongly convex with modulus 7??. We deduce from Proposition 



2.^(ii)(c) that prox 7re y 1 is strictly contractive with constant (1 + jS) . Hence, we have 



\\x n+ x - x\\ < (1 - A„)||a;„ - x\\ + A " \\x n - 7„V/ 2 (.t„) -x + 7 n V/ 2 (x)||. 

1 + 717 

Recall that an operator R : Ti — > H is noncxpansive if (V(y,z) G Ti 2 ) \\R{x) — i?(y)|| < ||x — y\\. An 
operator T : Tt — > Ti. is a-averagcd with a g]0, 1[ if T = (1 — a)Id + aR where R is a nonexpansive operator. 

Since / 2 is a diffcrcntiablc convex function having a /3-Lipschitz continuous gradient with (3 > 0, we 
deduce from the Baillon-Haddad theorem g], that V/ 2 //3 is 1/2-average. As 7„ G]0,2//3[ , by using [171 
Lemma 2.3], Id — 7„V/ 2 is ^^-averaged and it is therefore nonexpansive (see [T7J Lemma 2.1 (ii)] ) . 
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This entails that 

\\x„ - 7nV/ 2 (a;„) - x + j n Vf 2 (x)\\ < ||a; n -x|| 

and, consequently, 

\\x n+1 - x\\ < (l - y^p^) II s " - *\\ < (i ~ i^^) H x « " *ll 
which results in ITT1). □ 



The linear convergence of the forward-backward algorithm was also proved in [5J 116) under different 
assumptions. 



3.1.2 Computation of prox tc+Kg 

Let k > and g be a differentiable function with /3-Lipschitz continuous gradient where (3 G ]0, +oo[. Let 
C be a closed convex set such that C ^ 0. Then, for every x £ H, the determination of prox tc+Kg x can 
be viewed as a minimization problem of the form of Problem 13.11 Indeed, by using the definition of the 
proximity operator, we have: 



(Vr G H) prox Kg+Lc x = argmin i \\y - x\\ 2 + ng{y) + L C {y). 



Now, we can set /i = 5 ||. — x\\' 2 + lq and f 2 = Kg. The proximity operator of j n fi with j n G ]0, +00 [, is 
the proximity operator of • || 2 — 7n( - , a;) + tcj which is straightforwardly deduced from Proposition 1 2.^1)] 



and (ii) (a) 



(VyeH) P rox Wl2/ = P C (^^). (12) 

whereas / 2 has a K/3-Lipschitz continuous gradient. In this case, by setting a n = b n = in Algorithm (fTU)l . 
we get 

(Vn G N) = as n + A„ I Pc y j— J -x n \ (13) 

with 

< 7 < 7 „ < 7 < 2k" 1 /?" 1 - (14) 
The obtained algorithm possesses the following properties: 



Proposition 3.3 Suppose that Condition (|14[) and Assumption \ 3.1i\iij\ hold. Consider the algorithm in 
(fT3]) where x G H. Then, 

(i) we have: 

(Vn G N) ||z„ - prox tc+Kg a;|| < p n \\x - prox tc+Kff a:|| (15) 

where 

P = 1 -Tti> (16) 

(ii) &j/ setting xq = prox Kg x, we get: 

prox Kg x EC (Vn G N) x n = prox LC+lig x. (17) 
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/ A7 \ " 

(Vn 6 N) |jx„ - prox tc+Kg a-|| < {1 - ^-j=- j ||prox Kg a; - prox, c+Kg a;|| = (18) 



Proof, (i) : As /1 is strongly convex with modulus 1. (|15[1 is obtained by invoking Lemma [ 
(ii) : If xq — prox Kg x G C, then (fT5|) leads to 

1' 

where Proposition 12. l|Fi)| has been used in the last equality. This shows that (fT7|) is satisfied. □ 
Remark 3.4 

(i) .Eg. (|15|) shows that (x n ) n ^ converges linearly to prox tc+Kg ;c. Although this equation provides an 
upper bound, it suggests to choose X n and 7„ as large as possible (i.e. A n = 1 and j n close to 
Zk to opt imize the convergence rate. This fact was confirmed by our simulations. 



(ii) Proposition \3.3Q(ii)\ may appear as a desirable property since Proposition \2.tf(i)\ states that, when 
prox Kg x G C , prox tc+Kg 2; takes a trivial form. In this case, the convergence is indeed guaranteed 
in just one iteration by appropriately initializing the algorithm. Note however that prox Kg a; may not 
always be simple to compute, depending on the form of g. 

(iii) An alternative numerical method for the computation of prox (c+Kg .T would consist of setting f\ = lq 
and f 2 = \ || . — £|| 2 + Kg in the forward-backward algorithm, so yielding 

{Vn G N) x n+1 = x„ + X n (P c (x n - 7„(KVg(x„) + x„ - a;)) - x n ) 

with < 7 < 7 < 2(k(3 + l)" 1 . It can be noticed that the forward-backward algorithm then reduces to 
a projected gradient algorithm [51 Chap. 3., Sect. 3.3.2] [T], when A„ = 1. In our experiments, it was 
however observed that the convergence of this algorithm is slower than that in (|13p . probably due to 
the fact that prox™^ is no longer strictly contractive for the second choice of f\. 



3.2 Douglas-Rachford approach 

Let us relax the Lipschitz continuity assumption in Problem 13.11 and turn our attention to the optimization 
problem: 

Problem 3.2 Let g\ and gi be functions in To(Tl) such that Argmingi + 92 7^ 0- Assume that one of the 
following three conditions is satisfied: 

(i) dom g 2 H int dom gi 7^ H 

(ii) dom gi n int dom g 2 =/= 0. 

(iii) Ti. is finite dimensional and rint dom g\ H rint dom g 2 =/= . 

Find mingi(z) + g 2 (z). 

In the statement of the above problem, the notation differs from that used in Problem 13.11 to emphasize 
the difference in the assumptions which have been adopted and facilitate the presentation of the algorithms 
subsequently presented in Section 2] 

The Douglas-Rachford algorithm, proposed in |33U25| . provides an appealing numerical solution to Problem 
13.21 as described next. 



2 The interior (resp. relative interior) of a set S of Ti. is designated by intS (resp. rintS 1 ). 
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3.2.1 Algorithm QH Eq.(19)] 



Set zq £ Ti and compute, for every m £ N, 

Z m +1 = z m + T m (prox Kgi (2z m+ i - z m ) +a m - z m+ i) 

where k > 0, (r m ) mgN is a sequence of positive reals, and (a m ) mgN (resp. (6 m ) meN ) is a sequence of errors 
in Ti. allowed in the computation of the proximity operator of k<?i (resp. ftg 2 )- 

Then, (z m )„ lS N converges weakly to z £ H [T71 Corollary 5.2] such that prox Kg2 z is solution to Problem l3.2[ 
provided that: 

Assumption 3.5 

(i) (Vm £ N) r m G]0, 2[ and £ meN r ™( 2 - r m) = +°°- 

(li) y] m gfi[ T m ( 1 1 flyn, 1 1 -+- \\b m < +oo. 



An alternate convergence result is the following: 

Proposition 3.6 Suppose that the assumptions of Problem \3.2\ hold. If gi is a strongly convex function, 
then the Douglas- Rachford algorithm in (|19p with inf mS N r m > 0, sup mgN r m < 2 and a m = b m = is such 
that (z m+1 / 2 )meN converges strongly to the unique solution to Problem \3.2[ 

Proof. Let the rprox operator be defined, for every / £ Tq(TI), by 

rproxj = 2prox^- — Id. (20) 
Let us rewrite the Douglas-Rachford iteration in (|19p with a m = b m = as z m+ i = S m z m , where 

S m = r m prox K9i (rprox Kga ) + Id - r m prox Kg2 . (21) 

For all {y,y') £ Ti 2 , we have 

\\S m y - S m y'\\ 2 = T 2 J|prox Kgi (rprox Kg2 w) - prox Kgi (rprox Kg2 ?/)l| 2 
+ 2r m (prox Kgi (rprox Kg2 y) - prox Kgi (rprox^y'), y - T m prox Kfl2 y - y' + r ro prox reS2 y') (22) 
+ \\y- T m prox Kg2 y -y' + r m prox Kg2 ?/'|| 2 . 

Since ng± £ Tq(H), prox Kgi is firmly nonexpansive [201 Lemma 2.4] and, the expression in (|22[) can be upper 
bounded as follows 

\\S m y - S m y'\\ 2 

< <prox Kffi (rprox Kff2 y) - prox Kgi (rprox reg2 ?/), rprox„. ff2 y - rprox Kg2 y') 
+2r m (prox Kgi (rprox Kg2 y) - prox Kfll (rprox Kg2 y'),y - r m prox Kfl2 j/ - y' + T m prox Kg2 y') 
+ \\y- T m prox Kg2 y -y' + r m prox Kg2 y'|| 2 

which yields after simplifications: 

\\S m y - S m y'\\ 2 < r m (2 - r m ) (prox Kgi (rprox Kg2 y) - prox Kgi (rprox Kff2 y'), V ~ y') 

+ \\y- TmPi'Ox K92 y - y' + r m prox Kg2 j/'|| 2 . 
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Using the definition of the operator S m in (|2ip . we thus obtain, after some simple calculations, 

\\S m y - S m y'\\ 2 < (2 - r m ) (S m ?/ - S m y',y- y') + (r m - l)||y - t/|| 2 

- T m( <P r ox K92 y - prox Kg2 y', y-y')- ||prox Kg2 ?/ - prox Kg2 z/|| 2 ) . (23) 

Let 9 be the modulus of the strongly convex function g 2 . Then ng 2 is strongly convex with modulus kO and 
Proposition 12. £f(ii)(b) states that the following inequality holds: 

{p mx Kg 2 y - w° x Kg2 y' ,y - v') > ( k9 + ^Wwox^y - wox Kg2 y'\\ 2 , 

which combined with ([23]) leads to 

\\S m y - S m y'\\ 2 +KeT^\\prox Kg2 y - prox Kg2 z/|| 2 

< (2 - r m ) (S m y - S m y', y - if) + ( 



J m - — 2/' 



(24) 



Now, let z be the unique minimizcr of g± + g 2 . Hence, z = prox K59 z where z is a fixed point of S m . 
Consequently, by setting y = z m and y' = z in (|24[) . we deduce that 

||z m +l - Z\\ 2 + K6T^\\z m+ l - Z|| 2 

< (2 - r m ) (z m+ i - z,z m - z) + (r m - l)||z m - z|| 2 . (25) 

Using the fact that 

2 (z m+ i - z,z m - z) = \\z m+ i - z\\ 2 + \\z m - z\\ 2 - \\z m+1 - z m \\ 2 
(l25l) can be rewritten as 



T m \\z m+ i - z\\ 2 + (2 - T m )\\z m+ i - z m \\ 2 + 2K0T^ n \\z m+ i - z\\ 2 < T m \\z m - z\\ 2 . (26) 

Considering Assumption 13 . 5[ (2 — r m ) ||z m +i — z m \\ 2 is nonnegative and the left-hand side term of inequality 
(|26|) can be lower bounded, so yielding 

T m \\z m +\ - z\\ 2 + 2n6T? n \\z m+ i - z\\ 2 < T m \\z m - z\\ 2 . 

Finally, by using the assumption that r = inf mS N r m > 0, we obtain 

\\z m+1 -z\\ 2 + 2K9r\\z m+i - S\\ 2 < \\z m - z\\ 2 . (27) 

This entails that ||z m +i — z\\ 2 < \\z m — z\\ 2 and, the sequence (\\z m — z\\) m&N being decreasing, there exists 
c G ]0, +oo[ such that lim m ^ +oc \\z m — z\\ = c. In turn, from |27|) . we conclude that lim TO ^ +00 z m+ i = z, 
which shows the strong convergence of (z m+ i/2) m eN to the unique minimizer of g\ + g 2 - D 

It can be noticed that, although the convergence of the Douglas- Rachford algorithm generally requires 
that T m < 2, the strong convergence is obtained under the above assumptions, when r m = 2. The limit case 
of the Douglas- Rachford corresponding to r TO = 2 is known as the Peaceman- Rachford algorithm [T7] . 



3.2.2 Computation of prox tc+7 ^ 

Let C be a nonempty closed convex set of 7i. The Douglas- Rachford algorithm can be used to compute 
prox tc+7 j where / € Tq{T-L) and 7 is a positive constant, using again the definition of the proximity operator: 

1 2 

(Vx G H) prox tc+7/ a; = argmin - \\y - x\\ + i c (y) + 7/(2/). (28) 

The above minimization problem appears as a specialization of Problem 13.21 by setting gi = 7/ and g 2 = 
5 II* — x\\ 2 + lq, provided that one of the following three conditions holds: 
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Assumption 3.7 



(i) C n intdom/ ^ 0. 

(ii) dom/ n intC 7^ 0. 

(iii) 7i is finite dimensional and rintC H rintdom/ 7^ 0. 



the desired proximity operator. Note that both prox Kgi and prox K£(2 with n > 0, have to be calculated to 



Subsequently, we propose to use the Douglas- Rachford algorithm in (jl9|) with a m = b m = 0, to compute 
the desired proximity operator. Note that 
apply this algorithm. In our case, we have 

P r ox Kgi = prox K7/ 



and, similarly to (|12|) . 

The resulting Douglas-Rachford iterations read: for every m G N, 

Z m+ 1 = P C 



1 + k J (29) 
2m+l = z m + r m (prox K7/ (2z„ l+ i - z m ) - z m+ i). 



This algorithm enjoys the following properties: 



Proposition 3.8 Suppose that one of Assumptions ^. WiJ\ \3. Wti)\ or \3. \iii) holds. Consider the algorithm 
in (|2"9")l where x € TL, inf me jqT m > and sup meN r m < 2. Then, 

(i) (z m+ i) meN converges strongly to prox tc+7/ cc; 

(ii) by setting k = 1 and zq = 2prox 7 ^x — x, we get: 

prox 7 jx G C (Vm G N) z m+ i = prox LC+jf x. (30) 



Proof, (i) As <?2 is strongly convex with modulus 1, (i) holds by invoking Proposition! 
(ii) Set k = 1, Zq = 2prox 7 jX — x with prox 7 ja; G C. By considering the first iteration of the Douglas- 
Rachford algorithm (m = 0), we have zi = prox^a; and z\ = zq. So, by induction, (Vm G N) z m+ i = 



prox^x, which is also equal to prox tc+7 ya; according to Proposition 12. l|(T)| □ 
Remark 3.9 

(i) As already observed in Remark \3.$ii)\ (|30|) is a desirable property. It shows that the proposed algorithm 
converges in one iteration when prox^jX £ C , which appears quite consistent in the light of Proposition 

IE 



(ii) Other choices can be envisaged for g\ and g2, namely 



(a) gi = \ \[ - x\\ 2 + l c and g 2 = 7/ 

( b ) 5i = 5 II ' - a^ll 2 + lf and 92 = ic 

(c) 9i = ic and 92 = h II • - ^ 1 1 2 + if- 



Nevertheless, the strong convergence of (z m+1 / 2 )meN in virtue of Proposition^^ is only guaranteed 
in the third case, whereas Property (|30[) holds only in the first case (when n = 1 and zq = x). The 
second case was investigated in \24f , where the good numerical behaviour of the resulting algorithm 
was demonstrated. 
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3.3 Discussion 



Both Algorithms (fTS")) and (j2T))) allow us to determine the proximity operator of the sum of the indicator 
function of a closed convex set and a function in Tq(TI). The main difference between the two methods is 
that, in the former one, the convex function needs to be diffcrentiable with a Lipschitz-continuous gradient, 
whereas the latter requires that the proximity operator of the convex function is easy to compute. In 
addition, the forward-backward algorithm converges linearly, while we were only able to prove the strong 
convergence of the Douglas-Rachford algorithm. As we have shown also, the two algorithms are consistent 
with Proposition |2~11T) 



4 Proposed algorithms to minimize / + g + lq 

We have presented two approaches to minimize the sum of two functions in Tq(H). We have also seen that 
these methods can be employed to compute the proximity operator of the sum of the indicator function of 
a closed convex set C and a function in Tq(H). 

We now come back to the more general form of Problem II. 1[ for which we will propose two solutions. 
Both of them correspond to a combination of the forward-backward algorithm and the Douglas-Rachford 
one. 



4.1 First method: insertion of a forward-backward step in the Douglas- 
Rachford algorithm 

We propose to apply the Douglas-Rachford algorithm as described in Section l3~2"l when g\ = / and 92 = 
LC + <?■ If we refer to (fT9|) . we need to determine prox Kffl = prox K j and prox Kg2 = prox tc+Kff , where k > 0. 
The main difficulty lies in the computation of the second proximity operator. As proposed in Section [3. 1.21 
we can use a forward-backward algorithm to achieve this goal. The resulting algorithm is: 

Algorithm 4.1 

© Set 7 G]0, 2k~ 1 /3~ 1 [, A e]0, 1] and k G ]0, +oo[. Choose (r m ) m6 N satisfying Assumvtion \3.^(i)\ 
© Set m = 0, z = z_i/ 2 G C. 

© Set X m fi = Zm-l/2- 

® Forn = 0,...,N m -l 

a) Choose ^ m , n G [7, 2k' 1 P' 1 [ and \ m , n G [A, 1]. 

b ) Compute 

V V 1 + 7m,« > J 

© Set z m+ i = x m> N m . 

© Set z m+ i = z rn + T m (prox K/ (2z m+ i - z m ) - z m+ i). 
® Increment m (m <— m + 1) and goto ®. 
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Step ® allows us to set the algorithm parameters and Step © corresponds to the initialization of the 
algorithm. At iteration m > 0, Step © consists of N m > 1 iterations of the forward-backward part of the 
algorithm, where possibly varying step-sizes (y m ,n)n and relaxation parameters (A ro ,ti)n arc used. Finally 
Steps © and © correspond to the Douglas-Rachford iteration. Here, the error term a m in the computation 
of prox K ^ is assumed to be equal to zero but, due to the finite number of iterations N m performed in Step 
©, an error b m = z m+1 / 2 — P r ox tc+Kg z m may be introduced in Step ©. 

It can be noticed that the forward-backward algorithm has not been initialized in Step © as suggested by 
Proposition 13. 3^ii)j Indeed, as already mentioned, the computation of prox Kg z m would be generally costly. 
Furthermore, the initialization in Step © is useful to guarantee the following properties: 



Proposition 4.1 Suppose that Problem \l.l\ has a solution and that one of Assumptions \3 . 1ft i)\ \3.1((ii)\ or 
\3. ffitg] holds. 



(i) Let £ > and p be given by (|16p . Ifini g(C) > —oo and, for every meN, the positive integer N m is 
chosen such that 



-V2^(g(z ) -Mg(C)) 1 < £ if m = (31a) 

^-i(l + £-y- m ||2m-2 m -i||) <1 z/m>0 (31b) 



then, (z m ) m £N converges weakly to z € Ti such that prox LC+Kg z is solution to Problem M.l 
(ii) For every m G N, (x mirl )o<n<A r m (and thus, z m+ i/ 2 ) lies in C. 



Proof, (i) According to Proposition 13. 3^Tj| for every m 6 N, 

(Vn e {0, . . . , N m }) \\x m . n - prox tc+Kff z m || < p n \\x m>0 - prox LC+Kg z n \\ 
and, consequently 

\\b m \\ = \\z m+ i/2 - prox tc+Kg z m |) < p Nm \\z m -i/2 - WOX lc+Kg z m \\. (32) 
Let us next show by induction that Conditions (|31a[) and (|31b|) allow us to guarantee that 

IIM < P m t (33) 

• If m = 0, we deduce from (f3"2"| that 

\\bo\\<p No \\zo-pro Xlc+Kg zo\\. (34) 
From the definition of the proximity operator, we have 

(VxeC) ^\\z -x\\ 2 + ng(x) 

> -\\z - prox tc+Kg z |j 2 + K. 9 (prox tc+Kg z ) 



> t:\\ z o- prox tc+Kff z || 2 + k mfg(C) 



and, since € C, 



\\z - prox, c+Kg z || 2 < 2n(g(z ) - Mg(C)). 
By combining the latter inequality with (|3"4"|) and (|31ap . we conclude that ||&o| < £• 
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Now, let us show that (f3"3"| holds for m > 0, by assuming that ||b m _i|| < p m 1 £. Using (|3"!2)) , we have 

H&mll < P Nm (lkm-i/2 - prox, c+Kg z m _ 1 + prox lc+res z m _i - prox tc+K9 z m ||) 

< ^(H&m-ill + l|prox tc+K9 z m _i - prox, c+Kg z m ||) 

< p Nm (H&m-iH + \\z m -i - z m \\) 

where the nonexpansivity of prox tc+K£( has been used in the last inequality. From the induction 
assumption, we deduce that 

||bm|| <p Nm (p m -^+\\z m -l-Z m \\) 

which, according to (|31b[) . leads to (f33| . 



Then, ([55)1 allows us to claim that Assumption (|3.5[ (ii) is satisfied since 



T ™(h,n\\ + ||M) < 2?(1 - p)- 1 . 



By further noticing that Assumption 13.71 is equivalent to (i) (dom (lq + g) H intdom/ ^= 0), (ii) 



(dom / fl int dom (lq + g) 7^ 0) or, (iii) Ti is finite dimensional and (rint dom / n rint dom (lq + g) ^ 0), the 



conditions for the weak convergence of the Douglas-Rachford algorithm are therefore fulfilled. 



(ii) The property can be proved by induction by noticing that xq,o = z -i/2 € C and that a; TOjn _)_i is a 



convex combination of x mj „ and the projection onto C of an element of 7Y.D 

Eqs. (|31a[) an( l (|31b|) constitute more a theoretical guaranty for the convergence of the proposed al- 
gorithm than a practical guideline for the choice of N m . In our numerical experiments, these conditions 
were indeed observed to provide overpessimistic values of the number of forward-backward iterations to be 
applied in Step ©. 



As a consequence of Proposition 14. ^{u)\ in Step @b), the gradient of g is only evaluated on C. This 
means that the assumption of Lipschitz-continuity on the gradient of g is only required on C and therefore, 
the algorithm can be applied to the following more general setting: 

Problem 4.1 Let C be a nonempty closed convex subset of Ti. Let f and g be in Tq(TI), where g is 
differentiable on C with a (3-Lipschitz continuous gradient for some (3 G ]0, +oo[o 

Find min f(x) + g{x). 



Note that, in the latter problem, the function g does need to be finite. 



4.2 Second method: insertion of a Douglas-Rachford step in the forward- 
backward algorithm 

For this method, a different association between the functions involved in Problem ll.ll is considered by setting 
fx = ic + / and /2 = g. Since fi has then a /3-Lipschitz continuous gradient, we can apply the forward- 
backward algorithm presented in Section 13.1.11 This requires however to compute prox 7n ^ 1 = prox tc+7n j, 
which can be performed with Douglas-Rachford iterations. 

Let us summarize the complete form of the second algorithm we propose to solve Problem 11.11 

3 That is there exists an open set containing C on which g is differentiable with a /3-Lipschitz continuous gradient. 



14 



Algorithm 4.2 



© Choose sequences (7n)neN and (X n )neN satisfying Assumptions \3. and (ii) Setr s]0,2]. 

© Set n = 0, x Q £ C. 

© Set x' n = x n - j n Wg(x n ). 

® Set z n>0 = 2prox 7n/ a;^ - x' n . 

© For m = 0,...,M n -l 

a) Compute z nm+ i = P c ( Zn ' m ^ Xn ) . 

b) Choose T n ,m S \t, 2]. 

Compute z n ,m+l = z„, m + r n , m (prox 7n/ (2z n jTO+ i - z„ jTO ) - z nm+ i). 
d) If Z n , m +i = z n>m , then goto ©. 
© Set x n+ i = x n + A„(z n>m+ ! - x n ). 
® Increment n (n <— n + 1) cmd go£o ®. 

We see that Step © consists of at most M n > 1 iterations of the Douglas-Rachford algorithm described 
in Section 13. 2. 2\ which is initialized in accordance with Proposition 13 . iffiii)] Steps © and © correspond to a 
forward-backward iteration. Let m n < M n be the iteration number where the Douglas-Rachford algorithm 
stops. The error terms involved in Step © are a n = z nrrin+ i — prox LC+7n fX n and b n = 0. The properties of 
the algorithm are then the following: 



Proposition 4.2 Suppose that Problem \l.l\ has a solution and one of the Assumptions \3. 'tf(i% \3. lKii)\ or 
ETffm)! holds. 



(i) There exists a sequence of positive integers (M n ) ne ^ such that, if {in € N) M n > M n then, (x n )ner 
converges weakly to a solution to Problem ll.li 

(ii) The sequence (i n ) ne n lies in C . 



Proof, (i) Set p e]0, 1[. Let n€N and (z n ,m)meN be defined by iterating Steps ©a), b) and c). By invoking 
Proposition 13 . £^i)| we know that (z n m+ i ) m eN converges strongly to prax lc+Jn fx' n . This implies that there 
exists M n > 1 such that 

(VmeN) m>M„-l => \\z n>m+ i - piox LC+lnf x'J < p n . 
If M n > M n , we deduce that 

IK II = ll z n,m n +i -prox, ( , + . i>/ .r;,|| < p" 

since either m n = M n — 1 or the algorithm stops in Step ©d) (in which case z n ,m„ is a fixed point of 
the recursion in Step ©c) and z nmn+ i = prox (c+7n yX^). We therefore have SneNll a ™ll < +°° anc ^ the 
conditions for the weak convergence of the forward-backward algorithm arc fulfilled. 



ii) Wc have chosen xq in C. In addition, (Vn G N) (z n m+ i) m lies in C and x n +i is convex combination 



of x n and z n m+ i. Hence, it is easily shown by induction that (Vn > 1) x n € C. □ 
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Proposition I4.1](i)| guarantees that, by choosing M n large enough, the algorithm allows us to solve 
Problcm ll.il Although this result may appear somehow imprecise regarding the practical choice of M n , it 
was observed in our simulations that small values of M n arc sufficient to ensure the convergence. 

In addition, as a direct consequence of Proposition l4.^ii)| in Step ®, the gradient of g is only evaluated 
on C. This means that, similarly to Algorithm 14. 1[ this algorithm is able to solve Problem [41] In the next 
section, we will show that a number of image restoration problems can be formulated as Problem 14.11 



5 Application to a class of image restoration problems 



5.1 Context 



We aim at restoring an image y in a real separable Hilbert space Q from a degraded observation z G Q. Here, 
digital images of size N\ x 7V2 are considered and thus Q = WL N with N — N1N2. Let T be a linear operator 
from Q to Q modelling a linear degradation process, e.g. a convolutive blur. The image u = Ty (resp. 

z = (z^)i<{<n) is a realization of a real-valued random vector U = (U 1 )i<%<n (resp. Z = (Z^)i<i<jv)- 
The image U is contaminated by noise. Conditionally to U = (u^) i<i<N G Q, the random vector Z 
is assumed to have independent components, which are either discrete with conditional probability mass 
functions {n z ^i)_ u ^ )i<i<N, or absolutely continuous with conditional probability density functions which 
are also denoted by (p z r^w — u (t))i<i<N • I n this paper, we are interested in probability distributions such 
that: 

(Vie{l,...,JV})(Vt;eR) M z( , )|F <o = > W )<xexp(-^->)) (35) 
where the functions {tpi)\<i<N take their values in ] — 00, +00] and satisfy the following assumption. 

Assumption 5.1 There exists a nonempty subset I of {1, . . . ,7V} and a constant S £ K such that, for all 

ie{i,...,N}, 

(i) domipi =]5,+oo[ if i € I and, dom^i = [5, +oo[ if i ^ I; 

(ii) if i € I. then tpi is twice continuously differ entiable on ]S, +oo[ such that ia£ v ^]s ! -\- o[' l Pi( v ) > — 00 an d 

lim ijji{ v ) = +00. 

v — »(5 

v>8 

Its second- order derivative is decreasing and satisfies 

lim ^»=0; 

v — »+oo 



(iii) ifi^I, then there exists on G [0, +oo[ such that (Vi> G [5, +00 [) ipi( v ) = cuv. 



From Assumptions 
(Vu e}6, +00 [) ip'/(v) >"0)~such that 



il)j and (iii) it is clear that the functions (^>j)i<,<iv arc convex (since (V? G I) 

(36) 



lim ^-'(tO 

V — >0 

v>5 



-OO 



and they are lower semicontinuous (since (Vi G {1, . . . , N}) hminf„^^ ipi(v) > ipi(5)). Examples of such 
functions will be provided in Sections 15.31 and 15.41 
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In addition, a both simple and efficient prior probabilistic model on the unknown image y is adopted by 
using a representation of this image in a frame [21] [29] . The frame coefficient space is the Euclidean space 
H = M. K (K > N). We thus use a linear representation of the form: 

y = F*x 

where F* : TL — > Q is a frame synthesis operator, i.e. vld < F* o F < V Id with {v_,V) G ]0, +oo[ 2 (which 
implies that F* is surjective)0 We then assume that the vector x of frame coefficients is a realization of a 

random vector X with independent components. Each component X with k G {1, . . . , K} of X, has a 
probability density exp(— 0fc('))/ ex P( — fiki 7 ])) dr\ where is a finite function in r (M). 

Finally, we assume that we have prior information on x which can be expressed by the fact that x 
belongs to a closed convex set C of TL. The constraint set C will be assumed to satisfy: 

(TC*) ndom* ^ (37) 

where 

C* = F*C = {F*x | x G C) 

and 

JV 

(Vu=(«W) 1 < i < Ar ee) *(«) = X;V'i(« w ). 



With these assumptions, it can be shown (see Q3]) that a Maximum A Posteriori (MAP) estimate of 
the vector of frame coefficients x can be obtained from z — (z^) 1<i<N by minimizing in the Hilbert space 
Ti the function / + g + lq where 

(Vx = (x {k) ) 1<k<K G H) /(x) = f]^(^ (fc) ) (38) 

fc=i 

and 

g = $oToF*. (39) 



We consequently have: 

Proposition 5.2 Let H = R K and Q = l w twift K > N. Let f and g be defined by ((38]) anrf |39 
respectively, where T:Q Q is a linear operator. Under Assumvtion \5. 1\ and Condition (|37[) . then 



(i) / and g are in Yq{H); 

(ii) if f is coerciv^ or dom g f] C is bounded, then the minimization of f + g + lq admits a solution. Ln 
addition, if f is strictly convex on dom g D C, the solution is unique. 



Proof, (i) It is clear that / is a finite convex function of Tt. As the functions (^i)i<»<JV are in ro(lR), 
belongs to Tq(Q). In addition, by using ([37]), we have ran (ToF*)fl dom 'I' ^ 0. This allows us to deduce 
that dom g ^ and, therefore, g G Tq(TI). 



'ii) We have dom/ n domg flC/0 since dom / = TL and ([37]) shows that dom g DC — dom(5' o T < 



.F*) n C/0. Since / and g are in Tq(TI), we deduce that / + g + lq is in Tq(H). 



4 The existence of the lower bound implies the existence of the upper bound in finite dimensional case. 
5 This means that lim|| I ,||__|_ 00 f(x) = +oo. 
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Suppose now that / is coercive. By Assumption I5.l|(ii)j (Vi G I) irf v e]s.+oo[ipi( v ) > whereas, due to 
Assumption I5.l||^iii)[ (Vi ^ I) iiii v e[s.+oc[i } i( v ) = This implies that inf ^(Q) > — oo and, consequently, 
inf g(7i) > inf \I/((7) > — oo. As a result, / + g + ic > f + i-c + inf g(T~C) is coercive. When dom<? n C is 
bounded, / + g + ic also is coercive. The existence of a solution to the minimization problem follows from 
classical results in convex analysis [26l Chap. 3, Prop. 1.2]. 

When / is strictly convex on dom g n C, the uniqueness of the solution follows from the fact that f + g + Lc 
is strictly convex [26l Chap. 3, Prop. 1.2]. □ 

Remark 5.3 The function f is coercive (resp. strictly convex) if and only if the functions (4>k)i<k<N are 
coercive [141 Prop. 3.3(iii)(c)] (resp. strictly convex). 

5.2 Quadratic extension 

If we now investigate the Lipschitz-continuity of the gradient of g, it turns out that this property may be 
violated since ^ is not finite. Due to ([55]) . the gradient of g is not even guaranteed to be Lipschitz-continuous 
on int dom g. 

To circumvent this problem, it can be noticed that, because of Assumption 1 5. l|t ii)j and (|36|) . for all i G I, 
there exists a decreasing function V{ : ]0, +oo[ —►]<£, +oo[ such that limg^+oo Vi(9) = 5 and 

(ye G ]0, +oo[)(Vt; €}6, +oo [) < 0f (u) < 6> ^ w > y*(0). (40) 

Let us now consider the function go ~ ^ o T o F* with 6 G ]0, +00 [, where 

(v" = (^)kk W ^) <M«) = £iM« {<) ) 

i=l 

and the functions (tpe,i)i<i<N are chosen such that, 

( ^ 2 + Ci,i (#) w + Ci,o(») if « G I and S — e(9) <v< v t {9) 
(Vv G K) - I a . v if i ^ I and S — e{6) <v<5 ( 41 ) 

I ipi(v) otherwise. 

Hereabove, e : ]0, +oo[ — * ]0, +oo[ is a decreasing function and, 

(Vi G I) 6, o (0) = - (0)$M*)) + \{<d)) 2 

For every i £ I, the constants Ci,o(9) and have been determined so as to guarantee the continuity of 

ipffj and of its first order derivative at Vi(0). Consequently, the following result can be obtained: 

Proposition 5.4 Suppose that Assumption \5. 1\ and Condition (|37|) hold. Then, 

(i) (V0 g ]o,+oo[) 3e er (7i). 

(ii) (V(0i,0 2 ) G ]0,+oo[ 2 ), 0i < 2 ^ fffll < 59 2 < .9- 

(iii) for every 8 G ]0, +00 [, ifTC* c]S — e(8), +oo[ N , £/iera g# /ias a Lipschitz-continuous gradient over C 
with constant [3 g = 9\\TF*\\ 2 < 6v\\T\\ 2 . 
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(iv) For every 9 £ ]0, +oo[, if f is coercive or ifdomgg n C is bounded, then the minimization of f +gg + bc 
admits a solution. In addition, if f is strictly convex on dom gg PI C , then f + gg + be has a unique 
minimizer xg . 

(v) Assume that 

(a) lim e ^ +00 e(8) = 0, 

(b) TC* c [5,+oo[ N , 

(c) / is coercive or C is bounded, 

(d) / is strictly convex on C. 

Then, there exists 8 £ ]0, +oo[ such that, for every 8 £ [8, +oo[, the minimizer xg of f + gg + be is 
the minimizer of f + g + be ■ 



Proof, (i) Since ^g is defined and continuous on [S — e(9), +oo[ N and, (Vi £ {1, . . . , N}) (Vi> €]S — e(6), +oo[) 
tpgi(v) > 0, we have * e £ To(Q). In addition, dom* e n ran(To F*) D dom* n ran(To F*) ^ 0. Thus, 

99 € r (H). 



(ii) As a consequence of (|4"T|) and (f4*0|) . we have, for every i £ I, 

(Vve]5,v t (8 2 )[) ^(v) > = fe. 

So ^ — j is a strictly increasing function over ]S, ^(#2)] and 

(Vt> £]<U> 4 (# 2 )[) - < ~ ^.iM^)) = 

which, in turn, shows that ipi — ipe 2 ,i is strictly decreasing on ]S, t>i(#2)] and 

(Vv £]<W0 2 )[) ^(u) - Ve 2 » > ^i(«i(fl 2 )) ~ i>e 2 ,i(vi(6 2 )) = 0. 

In addition, we know that, if (i £ I and u < 6) or (i ^ I and v < S), then ipi(v) = +oo and, if (i £ I and 
w > ^1(82)) or (i I and u > (5), then = ipe 2 ,i{v). We deduce that, for alH £ {1, . . . , iV}, > '>p0 2 ,i 

and, therefore <? is lower bounded by gg 2 . 
By proceeding similarly, we have, for every i £ I, 

(Vw £ [i>i(0i), +00 [) ^e 2 ,i( v ) = i ) i( v ) = i>eiA v ) 

(Vv £]<5 - c(ft!),«i(fii)D $U U ) > 0i = 

(VuG]*-e(fl a ),t;i((?i)D ^,,M<i,,M 

(Vve[*-c(fi 3 ),u<(fli)D * 2 ,iM>* 1 ,iW- 

In addition, 

(V* £ {1, ... , iV})(Vu £] - 00, S - e{8 2 )[) = +00 > 

and 

(Vi £ I)(Vu £ [5 - e{8 2 ), +oo[) Vs 2 » = ^g ul {v). 
This shows that ^g 2 > and, consequently, gg 2 > . 

(iii) As already mentioned, dom'i'g = [6 — e(6), +oo[ N . Consider 

Og = (TF*)- 1 {]S-e(8),+oo[ N ) = {x £ H \ TF* x e]8 - e{9), +oo[ N }. 

Og is an open set and, as TC* c]S — e(6), +oo[ N , we have: C C Og. In addition, the function gg is 
differentiable on Og and its gradient is [551 Chap. 1, Prop. 5.7] 

(Vx £ Og) Vgg(x) = FT*(W g (TF*x)) (42) 
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where 

(Vu - ( U (l) )i< 4 <„ e]6 - e(9), +oo[ N ) V^ g (u) = (^> W ))i 

We have then 

(Vu = («W)i< 4 <„ e}5 - e(6), +oo[ N ) (Vt> = (u (i) )i<i<„ g]<S - e(6), +oo[ N ) 

N 



<i<N' 



v* e (u) - v* (v)\\ = ( J2 - ^,i(* (<) ))' 



and, by the mean value theorem, 



(Vie {1,..., A}) |^,i(« w )-^> w )|<l« 



,(0_„(0i 



»=i 



sup K t4 (e)| 

5e]<5-e(e),+oo[ 



1/2 



< % 



W_ W W| 



This yields 

(W €]<5 - e(0), +oo[ w ) (Vv e}5 - e(0), +oo[ N ) 
and, we deduce from (|42p that 

(V(x,x')eO e 2 ) 
and ||TF*|| 2 < ||F|| 2 ||r|| 2 < 77||T|| 2 . 



|V*fl(it)- Vtf 9 (w) 



< 



||V ff9 (a;) - V 5e (x')|| < 0||TF*|| 2 ||a; - x'\\ 
iv) The proof is similar to that of Proposition 1 5. ffiii)] 



(v) In the following, we use the notation: h = f 



and 



G ]0,+oo[) h e = f + ge + ic- 



Let (8e)t£N be an increasing sequence of ]0,+oo[ such that lim^ +00 dp = +oo. As a consequence of (i) 



and (ii) (h$ t )teti is an increasing sequence of functions in T (H). We deduce from [41] Proposition 7.4(d)] 
that (Zi^)igN epi-converges to its pointwise limit. By using (|41 [) in combination with the facts that (Vi G I) 
lime^ +00 i>i(#) = (5 and lime^ +oc e{6) = 0, we see that the pointwise limit is equal to h. 
Under Assumptions (v)(b) and (v)(c) (W G N) hg e is coercive since C n dom gg e = C. Equivalently, 
its level sets lev^ hg e =~lx G H | hg e (x) < 77} with 77 G K, are bounded. (h$ t )ee^ being a sequence of 
increasing functions, U^gj^lev^ hg t = lev< r) hg is bounded. As the functions hg e with f eN and h are lower 
semicontinuous and proper, [411 Theorem 7.33] allows us to claim that the sequence {xe e )e^N converges to 
the minimizer x of h (by Assumptions (v) (b) and (v) (d) both hg e with £ G N and h have a unique minimizcr 



due to the strict convexity of / on (C n doing) c(Cfl dom gg e ) = C and, Propositions 15. ffii)] and l5"^(iv) ). 
As x G dom/i, (Vi G I) (TF*x)^ G dom^ =]£, +oo[, where, for every x G H and i G {1, . . . , A}, (TF*x)W 
denotes the i-th component of vector TF*x. Since lini£_> +00 xg e = x, we have, for every i G I, 



(V?y G ]0,+oo[)(3^ ji G N) such that 

(WeN) ^>^,i=^|(TF*x e J« 



(TF*x) w | <?7 
>(TF*x e ,) (l) > min(TF*z) (l) - 



By setting rj = 



min ieI (TF*£)« 



where 



S + mm.i e i(TF*x 



,(') 



- > and £ r> = max^gi £ v ^ 1 we deduce that 
(W G N) £ > £ n => (TF*2 fl£ ) W > „ 

> <5. In addition, since lirmj__> +oc ^ = +oo =>• limf_> +00 maxjgi Vi(6i) 



(43) 



there exists £ > £ n such that (Vi G I) Vi(9j) < v. By using (|4Tj) . this implies that (Vi G I) (Vu G \v, +co[ 
4>9 r ,i( v ) = ^i(^)- By defining now 



fl = {i£ dom 5 | (Vi G I) (TF*x) {l > G [w,+oo[} 
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we deduce that (Vx e D) hg (x) = h(x). Moreover, according to Assumption (v)(b) for every £ G N, if 



{TF*x 9e )W G [S,+oo[. (44) 

Altogether, |43|) and (|44|) show that both xg 7 and x belong to D. Consequently, as xe- = argmin^g-H hg 7 {x), 
we have: h(xg 7 ) — hg 7 (xg 7 ) < hg 7 (x) = h(x), which proves that xg 7 = x. 



Considering now 9 G [Qj, +00 [, from (ii) we get: hg 7 < hg < h. Thus, h(x) = hg J {x) < hg{x) < which 
results in hg(x) = h{x), while 

(Vx G 7Y) hg(x) > hg 7 (x) > hg 7 (x) = h(x). 
This allows us to conclude that xg = x as soon as 9 > 6j = 9. □ 

Remark 5.5 (i) A polynomial approximation of the objective function was considered in \2T^ which is 
different from the proposed quadratic extension technique. 



(ii) As expressed by Proposition \5.$ii)\ gg (resp. f + gg + ic) with 9 > constitutes a lower approximation 
of g (resp. f + g + be), which becomes closer as 9 increases. 



(hi) As shown by Proposition \5.J^[iii) the main role of parameter 9 is to control the Lipschitz constant of 



the gradient of this approximation of g. 

(iv) At the same time, Proposition \5.)^[v\ indicates that this parameter allows us to control the closeness of 
the approximation to a minimizer of the original MAP criterion. This approximation becomes perfect 
when 9 becomes greater than some value 9. 

Under the assumptions of Proposition l5.^iii)[ the minimization of / + gg + lq with 9 G ]0, +00 [ is a problem 
of the type of Problem l4.ll Therefore, Propositions 14. II and 14.21 show that, provided that / is coercive or C 
is bounded, Algorithms 14.11 and 14.21 can be applied in this context. In addition, Proposition 15. ^v)] suggests 
that, by choosing 9 large enough, a solution to the original MAP criterion can be found. However, according 



to Proposition 15 .^(iii) a large value of 9 induces a large value of the Lipschitz constant [3g. This means that 
a small value of the step-size parameter must also to be used in the forward iteration of the algorithms, 
which is detrimental to the convergence speed. In practice, the choice of 9 results from a trade-off as will 
be illustrated by the numerical results. 



5.3 First example 
5.3.1 Model 

We want to restore an image y G [0, +00 [ N corrupted by a linear operator T : Q — > Q and an additive noise 
w G Q, having the observation 

z = Ty + w = u + w. 

In addition, the linear operator T is assumed to be nonnegative-valued (in the sense that the matrix 
associated to T has nonnegative elements) and, w = (u>^)i<i<jv is a realization of an independent zero- 
mean Gaussian noise W = (W (i) ) i<i<7V- The variance of each random variable with i G {1, . . . , N} is 
signal-dependent and is equal to af(u^ 1 ') where 

(V^e[0,+co[) = 
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with on G ]0, +oo[. So, the functions {ipi)\<i<N as defined in (|35p are, when z« 7^ 0, 

(Vu G M) ipi(v) 

and, when zW = 0, 



aAv — z^) 

i- if u g 0,+oo 

u 

-00 otherwise 



ctiV if w G [0, +00 [ 
+00 otherwise. 



(Vd e t) ipi(v) 

So, provided that z 7^ 0, Assumption 15.11 is satisfied with 5 = and I 
{i G {1, . . . ,N} I zW 7^ 0} since, for all i G I, 

(VuG]0,+oo[) $(v)=aT 



We deduce from (|4"0"1) that, for every i G I, 



r- 

2a ?; (z«) 2 



(V6» G 



]0,+ooD «*(*) = (^^-) ■ 



5.3.2 Simulation results 



Here, T is either a3x3ora7x7 uniform convolutive blur with ||T|| = 1. The 512 x 512 satellite image y 
(N = 512 2 ) shown in Fig.QJa) has been degraded by T and a signal-dependent additive noise following the 
model described in the previous section with on = 1 or on = 5. The degraded image z displayed in Fig. QJb) 
corresponds to a 7 x 7 uniform blur and on = 1. 

A twice redundant dual-tree tight frame representation [15] (v_ = V = 2, K = 2N) using symlet filters 
of length 6 |21j has been employed in this example. The potential functions 4>k are taken of the form 
Xk\- \ + . \ Pk where (xk, Wk) G ]0, +oo[ and pu G {4/3, 3/2, 2} are subband adaptive. These parameters 
have been determined by a maximum likelihood approach. The function / as defined by (|38j) is therefore 
coercive and strictly convex (see Rcmark l5.3[) . 

A constraint on the solution is introduced to take into account the range of admissible values in the 
image by choosing 

C* = [0,255]". (45) 

Due to the form of the operator T, TC* = C* and Condition (|37|) is therefore satisfied. Proposition 15.21 
thus guarantees that a unique solution x to the MAP estimation problem exists. According to Proposition 
I5.^iv)[ for every 9 £ ]0, +00 [, a unique minimizer xg of f+gg + t-c also exists which allows us to approximate 
x as stated by Proposition 1 5 . 4[ vj| 

Since, for every 9 G ]0, +00 [, TC* = C* C [—e(9),+oo[ N , Proposition I5.^iii)j shows that gg has a 
Lipschitz-continuous gradient over C and Algorithms 14.11 and 14.21 can be used to compute xg. The two 
algorithms are subsequently tested. 

On the one hand, when Algorithm 14. II is used, the initialization is performed by setting z$ — Pcz and 
we choose k = 60 and r m = 1. The projection onto C = (F*)~~ 1 C* is Pc = prox t toF * which can be 
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computed by using Proposition 12.61 with L = F* . The other parameters have been fixed to X m . n = 1 and 
7 mj „ = 0.995/ (k0). in compliance with Proposition 15. ^iii)| The convergence of the algorithm is secured by 
Proposition 14.11 since Assumption 13. jffij] trivially holds. However, to improve the convergence profile, the 
following empirical rule for choosing the number N m of forward-backward iterations has been substituted 
for the necessary Conditions (|31a| and (|31b|) : 

N m = inf {n 6 N* | \\x m , n - x m , n _ x \\ < 77} (46) 

with r/ = 1CT 4 . 

On the other hand, when Algorithm 14.21 is used, the parameters have been chosen as follows : A„ = 1 , 
r n>m = 1 and 7„ = 0.995/0. The algorithm has been initialized by setting xo = Pcz where the projection 
onto C is computed as described previously. The convergence of the algorithm is ensured by Proposition l4.2l 
The number M n of Douglas-Rachford iterations has been fixed as follows: 

M n = inf {m G N* | \\z n<m - z n ,m-l\\ < v} (47) 

with the same value of r\ as for the first algorithm. 

The error between an image y and the original image y is evaluated by the signal to noise ratio (SNR) 
defined as 20 log 10 (||y||/||y- V\\)- 

Three objectives are targeted in our experiments. First, we want to study the performance of the 
proposed approach, using the redundant dual-tree transform (DTT). The results presented in Tab. [T] have 
been generated by Algorithm 14. 1[ but Algorithm 14.21 leads to the same results. 





3x3 blur 


7x7 blur 




9 


0.025 


0.05 


5 


7 


0.025 


0.05 


5 


7 


Q-i = 1 


SNR 


13.9 


16.3 


16.8 


16.8 


10.9 


11.9 


12.1 


12.1 




e 


0.15 


0.25 


10 


12 


0.15 


0.25 


10 


12 


on = 5 


SNR 


15.9 


18.0 


18.8 


18.8 


12.6 


13.3 


13.7 


13.7 



Table 1: SNR for the satellite image. 



As suggested by Proposition E^v)] as 9 increases, the image is better restored. The effectiveness of the 
proposed approach is also demonstrated visually in Fig. [TJc) showing the restored image when T is a 7 x 7 
uniform blur, on = 1 and 9 = 0.05. It can be observed that the algorithm allows us to recover most of the 
details which were not perceptible due to blur and noise. 

Secondly, we aim at comparing the two proposed algorithms in terms of convergence for a given value 
of 9. In Fig. [2J the MAP criterion value is plotted as a function of the computational time for a 7 x 7 blur, 
on = 5 and 9 = 0.25. For improved readibility, the criterion has been normalized by subtracting the final 
value and dividing by the initial one. It can be noticed that Algorithm 14.21 converges faster than Algorithm 
14. 11 This fact was confirmed by other simulation results performed in various contexts. 

Finally, Fig. [3] illustrates the influence of the choice of the parameter 9 when Algorithm 14.21 is used for 
a 7 x 7 blur and on = 5. As expected, the larger 9 is, the slower the convergence of the algorithms is. A 
trade-off has therefore to be made: 9 must be chosen large enough to reach a good restoration quality but 
it should not be too large in order to get a fast convergence. 
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(c) 

Figure 1: Results for a satellite image of the city of Marseille, (a) Original image, (b) degraded image, (c) 
restored using a DTT. 
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ot, ^~r~ i i 

O 500 1000 1500 2000 



Figure 2: Normalized MAP criterion (Algorithm 14. II in red and Algorithm I4.2l in blue) versus computational 
time (in seconds) (Intel Xeon 4 Core, 3.00 GHz). 




1000 2000 3000 4000 5000 



Figure 3: Normalized MAP criterion (for 9 = 0.25 in green and 9 — 10 in magenta) versus computational 
time (in seconds) (Intel Xeon 4 Core, 3.00 GHz). 



25 



5.4 Second example 



5.4.1 Model 



In this second scenario, we want to restore an image y G [0, +oo[ A ' which is corrupted by a linear operator 
T : Q — > Q, assumed to be nonnegative- valued and, which is embedded in (possibly inhomogencous) Poisson 
noise. Thus, the observed image z = (^W)i<t<jv G N w is Poisson distributed, its conditional probability 
mass function being given by 

(Vie{l ) ...,jV})0A,e[0 ) +oc[) ^ (i)| _ w=v (^)) = i^^exp(-a iU ) (48) 

where (ai)i<i<N € ]0, +oo[ W are scaling parameters. 

Consequently, using (|35[) and (|48p. for every i G {1, . . . , AT}, we have, when zW > 0, 

,(*)• 



(v. 6 R ) ^ ( «) = | ^ - * w + * w in if w e ] °' +o ° [ 

[ +oo otherwise 

and, when arM — 0, 

if v € [0, +oo[ 



(49) 



(Vu G R) Vi(^) 



-co otherwise. 



As the functions (ipi)i<i<N are defined up to additive constants, these constants have been chosen in Ij49j) 
so as to obtain the expression of the classical Kullback-Leibler divergence term [10] . 

In this context, provided that z ^ 0, Assumption 15.11 holds with 6 = and I = 
{i G {1, . . . , N} | zW > 0} since, for all i G I, 



z (i) 

(VuG]0,+co[) ip' i (v) = a i 

v 

We deduce from (|40|) that, for every i G I, 

(WG]0,+oo[) ^(0) = 

Remark 5.6 At this point, it may be interesting to compare the proposed extension with the approach 
developed in [24] . The use of the Anscombe transform [2], in |24j is actually tantamount to approximating 
the anti log-likelihood ipi of the Poisson distribution by 




2 



(V.GR) Mv) = < h W atV+ i- Z ) 1/WG[0 ' +00[ (50) 
-oo otherwise. 

The proposed quadratic extension is illustrated in Fig. ^ where a graphical comparison with the Anscombe 
approximation is performed. 
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250 




Figure 4: Graph of the function ipt (black continuous line) when 8 = 0, a,; = 1, z^ = 100. Its quadratic 
extension ipg^ with 9 = 0.2 (purple dashed line) and 9 = 1 (red dashed line) for e(9) = 10 -16 and its 
Anscombc approximation ipi (cyan continuous line). 



5.4.2 Simulation results 



Here, T is a 5 x 5 uniform blur with ||T|j = 1. A 256 x 256 (N = 256 2 ) medical image y shown in Fig. [51(a) 
is degraded by T and corrupted by a Poisson noise following the model described in the previous section for 
various intensity levels. The degraded image z is displayed in Fig.[5](b) when 014 = 0.01. 

An orthonormal wavelet basis representation has been adopted using symlets of length 6 (v_ = V = 1, 
K = N). The potential functions <f>k are taken of the same form as in the first example and, the function / 
is therefore coercive and strictly convex. 

The constraint imposed on the solution is given by C = (F*)~ 1 C* where C* is defined by (|4"5")) . Since 
TC* = C*, Proposition &giv) 



been computed with Algorithm 



guarantees that a unique minimizer xg of / + gg + be exists, which has 



4.U The algorithm has been initialized by setting zq = Pqz and, we have 
chosen ~/ m ,n = 1.99/(/c0), k = 60 and X m ,n = r m = 1. The number of forward-backward iterations is given 
by (j46|) with rj = 10~ 4 . Note that the convergence rate could be accelerated by using adaptive step-size 
methods such as the Armijo- Goldstein search [HI [24]. However, the computational time of the step-size 
determination should be taken into account. 

To evaluate the performance of our algorithm we use the Signal to Noise Ratio defined in Section l"5. 3. 21 
Tab. [5] shows the values of the SNR obtained for different values of a% and 9. As predicted by Proposition 
I5.4|(v)[ beyond some value of 9, which is dependent of a*, the optimal value is found. We also compare 
our results with those provided by two different approaches. The first one is the regularized Expectation 
Maximization (EM) approach (also sometimes called SMART) [TO) [31] where the Poisson anti-likelihood 
penalized by a term proportional to the Kullback-Leibler divergence between the desired solution and a 
reference image is minimized. Its weighting factor has been adjusted manually so as to maximize the SNR 
and, the reference image is a constant image whose pixel values has been set to the mean value of the 
degraded image. The other approach is the method based on the Anscombe transform proposed in [24] and 
discussed in Remark 15.61 For fair comparisons, the method here employs the same orthonormal wavelet 
representation, the same functions {4>k)i<k<K as ours and the same constraint set C. It can be observed 
that the approach we propose gives good results. However, for high intensity levels (c^ > 0.1), the method 
based on the Anscombe transform performs equally well in terms of SNR. The restored images are shown 
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in Fig. [51 when a* = 0.01 and 9 = 0.001 after 3000 iterations. In spite of an important degradation of the 
original image, it can be seen that our approach is able to recover the main features in the image. It can 
also be noticed that the image restored by the two methods exhibit different visual characteristics. 
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9 = 0.005 


e = o.i 


e = 1 
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0.01 
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8.24 
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9.75 
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9.75 
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11.9 


11.9 


11.9 


0.1 


10.1 


12.4 


12.0 


12.5 


12.5 


12.5 


12.5 


1 


13.8 


15.1 





10.1 


13.7 


15.1 


15.1 



Table 2: SNR for the medical image. 



6 Conclusion 

Two main problems have been addressed in this paper. 

The first one concerns the minimization on a convex set C of a sum of two functions, one of which g 
being smooth while the other may be nonsmooth. Such a constrained minimization has been performed by 
combining forward-backward and Douglas-Rachford iterations. Various combinations of these algorithms 
can be envisaged and the study we made tends to show that Algorithm 14.21 is a good choice. It can be 
noticed that adding a constraint on the solution for a restoration problem was shown to be useful in another 
work [ID], where it appeared that the visual quality of the restored image can be much improved w.r.t. the 
unconstrained case, when both restoration approaches are applicable. 

The second point concerns the quadratic lower approximation technique we have proposed. This method 
offers a means of applying the proposed algorithms in cases when g is differcntiablc on C but the gradient 
of g is not necessary Lipschitz continuous on C . By quadratically extending g, the proposed constrained 
minimization algorithms can be used. This extension depends on a parameter 9 which controls the precision 
(closeness to the solution of the original minimization problem) and the convergence speed of the algorithm. 
As illustrated by the simulations, the choice of this parameter should result from a trade-off. The numerical 
results have also shown the efficiency of the proposed methods in deconvolution problems involving a signal- 
dependent Gaussian noise or a Poisson noise. 

Finally, it may be interesting to note that nested iterative algorithms similar to those developed in this 
paper can be used to solve min^g^ / + g + h where TL is a real separable Hilbcrt space, /, g and h are 
functions in To(TL) and g is /3-Lipschitz differcntiablc. 



A Study of Example |2H2 

Let p = proxy-cc and q = proxj +tc x where x G TL. Let g be the convex function defined by (Vy G TL) 

g(y) = i ||y — x\\ 2 + ^y T Ay. Consequently, p = (I + A) x is the minimizer of g on TL, whereas q is the 
minimizer of g on C . Thus, we can write (Vy G TL) g(y) = g(y) + h x where g(y) = i> (y — p) T (I + A)(y — p) 
and h x is a function of x. Then, q also minimizes g on C. 

In the example, we have chosen x = 2(Ai.2, 1 + A2,2) T , which yields p = (0, 2) T and Pc(p) = (0, 1) T . 

Let q = (tt, 1) t . To show that q = q, we have check that q minimizes g on C. A necessary and sufficient 
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condition for the latter property to be satisfied [301 p. 293, Theorem 1.1.1] is that 

(VyeC) (Vg(<?)) T (y-g)>0 
where Vg(q) = (I + A)(q — p) is the gradient of g at q. This is equivalent to prove that 

(V(2/«, y^Y G C) (2tt - Ai, 2 )(y« - tt) + (A 1)27 r - A 2 , 2 - l)(j/ 2 ) - 1) > 0. (51) 
Three cases must be considered: 

• when Ai. 2 < -2, (y (1) ,y' 2 ^) T 6C=> > — 1 = 7T and y (2) < 1. In addition, we have 2-k - Ai )2 = 
-2 - Ai. 2 > and A 2 , 2 - K\ 2 > Ai, 2 tt - A 2 . 2 - 1 < -A 2 2 - A x , 2 - 1 < 0. So, ^ETJ) holds. 

• When Ai >2 > 2, similar arguments hold. 

• When Ai, 2 G [-2, 2], 2n - Ai >2 = and Ai j2 tt - A 2 , 2 - 1 = ^ - A 2 , 2 - 1 < - 1 < 0, which 
shows that (f5Tj) is satisfied. 

This leads to the conclusion of Example 12.31 



B Study of Example [231 

Let / be the function defined in Example [231 Defining the rotation matrix R = ^= (\ , this function 

can be expressed as 

(Vz G R 2 ) f(x) = f(Rx) 

where f(x) = ^x T Ax with 

' 1 A 1:2 N 



A = 
In addition. 



Ai, 2 1 



C = {xeM 2 | Rx€ [-l,l] 2 } = iJ T [-l,l] 2 . 

It can be noticed that [— 1, l] 2 is the separable convex set considered in Example 12.31 whereas / appears as 
a particular case in the class of quadratic functions considered in this example (by setting A 2 2 = 1). 
Thus, the proximity operator of / is 

(Vx G U) proxfx = argmin -\\x - y\\ 2 + f(y) 

yen z 

= argmin — ||.Rx — Ry\\ 2 + f(Ry) = P T prox?(Px). 
yen 2 1 

and Pc(proxy-x) = P T P[_ii]2(Pprox^x) = P T P[_i : i]2 (prox^Px)) . Similarly, we have 

(Vx <G TL) proxy +(c .T = P T proxj +( ^ 2 (R%)- 

So, if x = 2i? T (A 1 , 2 ,2) T = V2(2 + Ai, 2 ,2 - Ai, 2 ) T , we deduce from Example l2~3l that P c (prox / x) = 
^(1, — 1) T and proXf +L/J x = ^=(1 + 7r + 1, 1 — 7r) T , where the expression of tt is given by (jHJ). It can be 
concluded that Pc(proxjx) ^ prox^ +tc a:. 
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